2025-01-16 15:46:26.AIbase.
Alibaba Cloud Launches New Mathematical Reasoning Model Qwen2.5-Math-PRM, 7B Version Surpasses GPT-4o
2025-01-16 10:42:26.AIbase.
Alibaba Qwen Team Releases New Process Reward Model, Advancing Mathematical Reasoning
2024-12-15 10:23:35.AIbase.
Alibaba Launches New AI Benchmark 'ProcessBench' to Assess Error Recognition Capability in Mathematical Reasoning
2024-11-29 09:47:51.AIbase.
Devastating Loss! Epoch AI Launches New Mathematics Benchmark FrontierMath, Top AI Models Solve Less Than 2%
2024-11-18 07:58:19.AIbase.
Kimi Launches Mathematical Reasoning Model k0-math: Math Capabilities Benchmarked Against OpenAI's o1 Series
2024-10-14 14:51:30.AIbase.
Apple Research Team Releases New Benchmark GSM-Symbolic: Revealing the Mathematical Reasoning Limitations of Large Language Models!
2024-10-12 14:59:01.AIbase.